ABSTRACT



Aims. We aim to study the large-scale structure of an extragalactic serendipitous X-ray survey with unprecedented accuracy thanks 
to the large statistics involved, and provide insight into the environment of AGN at the epochs when their space density declines 
(z ~ 1 - 2). 

Methods. In this paper we present the two-point angular correlation function of the X-ray source population of 1063 XMM-Newton 
observations at high Galactic latitudes, comprising up to ~30000 sources over a sky area of ~125.5 deg$^2$, in three energy bands:
0.5-2 (soft), 2-10 (hard), and 4.5-10 (ultrahard) keV. This is the largest survey of serendipitous X-ray sources ever used for clustering 
analysis. 

Results. We have measured the angular clustering of our survey and find significant positive clustering signals in the soft and hard
bands (~10$\sigma$ and ~5$\sigma$, respectively), and a marginal clustering detection in the ultrahard band (<1$\sigma$). We find a dependence of the
clustering strength on the flux limit and no significant differences in the clustering properties between sources with high hardness
ratios (and therefore likely to be obscured AGN) and those with low hardness ratios. We deprojected the angular clustering parameters
via Limber's equation to compute their typical spatial lengths. From that we have inferred that AGN at redshifts of ~1 are embedded
in dark matter haloes with typical masses of $\langle \log M_{\rm DMH} \rangle \simeq 12.60 \pm 0.34$ $h^{-1}$ $M_\odot$ and lifetimes of $t_{\rm AGN} = 3.1$-$4.5 \times 10^8$ yr.
Conclusions. Our results show that obscured and unobscured objects share similar clustering properties and therefore they both reside 
in similar environments, in agreement with the unified model of AGN. The short AGN lifetimes derived suggest that AGN activity 
might be a transient phase that can be experienced several times by a large fraction of galaxies throughout their lives. 

Key words. Surveys - X-rays: general - (Cosmology:) observations - galaxies: active 



1. Introduction 

Active galactic nuclei (AGN) are the brightest persistent ex- 
tragalactic sources known, with their X-ray emission the most 
common feature among them. Thanks to their large bolometric 
output, AGN can be detected through cosmological distances, 
which makes them essential tracers of galaxy formation and 
evolution, as well as the large-scale structure of the Universe. 
Clustering studies of AGN at redshift ~1, when strong structure 
formation processes took place, are key tools for understanding 
the underlying mass distribution and evolution of cosmic
structures (Hartwick & Schade 1990).

Moreover, a number of works suggest that the most luminous
AGN were formed earlier in the Universe, while the less
luminous ones were formed later (Ueda et al. 2003, Hasinger et
al. 2005, Barger et al. 2005, La Franca et al. 2005, Silverman
et al. 2008, Ebrero et al. 2009). Understanding the large-scale
clustering of AGN will therefore provide more clues to the
environment of AGN at these epochs and to how it is linked to the
formation and accretion of matter onto the central supermassive
black hole, since it is thought to be triggered by major mergers or
close interactions between galaxies (e.g. Kauffmann & Haehnelt
2000, Cavaliere & Vittorini 2002, Menci et al. 2004, Di Matteo
et al. 2005, Granato et al. 2006).

The simplest way to measure clustering is the two-point
angular correlation function, which measures the excess probability
of finding a pair of sources at a given angular distance. There
have been multiple determinations of the angular correlation
function of both optically and X-ray selected AGN (Vikhlinin &
Forman 1995, Akylas et al. 2000, Giacconi et al. 2001, Basilakos
et al. 2004, Basilakos et al. 2005, Gandhi et al. 2006, Carrera
et al. 2007, Miyaji et al. 2007, Ueda et al. 2008). These works
have shown that AGN detected in soft X-rays (0.5-2 keV) tend to 
cluster strongly. Nevertheless, at higher energies many of these 
results were inconclusive because of the small-number statistics, 
ranging from marginally significant to no clustering detections at 
all. 

The angular correlation function, however, only measures 
overdensities projected in the sky thus blurring the underly- 
ing spatial structure. With larger and more accurate spectro- 
scopic identification campaigns of AGN, an increasing number 
of works have calculated the spatial clustering of these sources 
(e.g. Mullis et al. 120041 Gilli et al. 120051 Yang et al. 120061 Gilli 
et al. l2009t . The clustering properties derived from these deter- 
minations have been used to evaluate the evolution of AGN clus- 
tering with redshift, and the mass of dark matter haloes (DMH) 
in the context of a cold dark matter (CDM) scenario, in which 
DMH of different mass cluster differently. 

To accurately measure the angular correlation function, a 
sample that achieves both width (to prevent biases caused by 
a single structure) and depth (to ensure high angular density) 






J. Ebrero et al.: High-precision multi-band measurements of the angular clustering of X-ray sources

Table 1. Sample summary. 

Fig. 1. Sky area of the survey as a function of flux in the soft
(solid line), hard (dashed line) and ultrahard (dot-dashed line) 
bands. 

is needed. Moreover, X-ray selected samples are less biased 
against obscuration than optically selected samples, especially 
in the 2-10 keV energy range, and are therefore an ideal resource 
for large-scale structure and evolutionary analysis. In this paper 
we use 1063 XMM-Newton observations at high Galactic latitudes
over a sky area of 125 deg$^2$ (Mateos et al. 2008) to compute
the angular correlation function of serendipitous X-ray sources
in three energy bands: 0.5-2 keV, 2-10 keV and 4.5-10 keV. The
sample comprises more than 30,000 sources, making it the largest
sample compiled so far to investigate clustering.

This paper is organised as follows: in Section 2 we overview
the sample used in this work and describe its general properties.
In Section 3 we perform a crude analysis similar to the traditional
counts-in-cells methods as a preliminary test for clustering. In
Section 4 we undertake the calculation of the angular correlation
function of our sample in different energy bands, while in
Section 5 we study the deprojection to the real space of our results
via the inversion of Limber's equation. Finally, our conclusions
are reported in Section 6.

Throughout this paper we have assumed a cosmological
framework with $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_m = 0.3$ and $\Omega_\Lambda = 0.7$
(Spergel et al. 2005).

2. The X-ray data 



In this work we use the sample presented in Mateos et al. (2008). 
The observations are a subset of those employed in producing
the second XMM-Newton serendipitous source catalogue,
2XMM (Watson et al. 2009). 2XMM is composed of observations
from the European Photon Imaging Cameras (EPIC) on board
XMM-Newton, although this sample was built using data from
the EPIC-pn camera only (Turner et al. 2001).

The selected observations fulfill the following criteria: 

1. High Galactic latitude fields ($|b| > 20^\circ$) in order to minimize
the contamination from Galactic sources. 

2. Fields with at least 5 ks of clean exposure time. 

3. Fields free of bright and/or extended X-ray sources. 

If there were observations carried out at the same sky position,
the overlapping area from the observation with the shortest
clean exposure time was removed. The resulting sample comprised
1129 observations. For the purposes of this work we have
also removed the observations belonging to the Virgo Cluster,
M31, M33, Large Magellanic Cloud and Small Magellanic
Cloud fields, ending up with a final sample of 1063 observations.

Band (keV)               N_obs (a)   N_sou (b)   Flux limit (erg cm$^{-2}$ s$^{-1}$)   Area (deg$^2$)
Soft (0.5-2)             1063        31288       $1.4 \times 10^{-15}$                 125.52
Hard (2-10)              1063        9188        $9.2 \times 10^{-15}$                 125.52
Ultrahard (4.5-10) (c)   432         1259        $1.4 \times 10^{-14}$                 51.47

(a) Number of XMM-Newton observations used
(b) Number of X-ray sources used
(c) For observations with exposure times >20 ks only

The analysis presented in this paper has been carried out in 
the following energy bands: 

- Soft: 0.5-2 keV 

- Hard: 2-10 keV. 

- Ultrahard: 4.5-10 keV 

Sources detected in each energy band have a minimum
detection likelihood of 15 (which roughly corresponds to a 5$\sigma$
significance of detection, Cash 1979) and fluxes $< 10^{-12}$ erg cm$^{-2}$
s$^{-1}$. The targets of the observations were removed.

The sky coverage as a function of the X-ray flux was
calculated using an empirical approach, computing a sensitivity map
for each observation that provides the minimum count rate
required for a source to be detected at each position in the field
of view, taking into account the local effective exposure and the
background level. The procedure is described in detail in Carrera
et al. (2007) and Mateos et al. (2008). Sources with actual count
rates below the sensitivity map value at their position were
therefore excluded from the analysis for consistency with the sky area
calculations. The fraction of sources removed this way is less
than 4% in the 0.5-2 keV band, ~5% in the 2-10 keV band, and
~7% in the 4.5-10 keV band. As in Mateos et al. (2008),
we have also only considered sources that were detectable over
a minimum area of $\Omega_{\rm min} = 1$ deg$^2$ in order to avoid uncertainties
due to the low count statistics, or inaccuracy in the sky coverage
calculation at the very faint detection limits. Less than 0.5%
of the sources in the soft and hard bands were removed this
way, while in the ultrahard band this fraction rises to ~1.5%.
Changes in the flux limits of the survey were negligible.

Since the density of sources per field is a key issue for the
study of angular clustering (a low source density can lead to a
no-clustering signal even if there is an actual clustering present,
see Carrera et al. 2007), we have removed the fields with exposure
times below 20 ks from the analysis in the ultrahard band.
This way we enhance the source density in this band from ~1 to
~4 sources per field (still far from the source density in the soft
and hard bands, ~30 and ~10 sources per field, respectively), for
a total of 432 fields.

Hence, the overall sky coverage of the sample is 125.52 deg$^2$
comprising 31288 and 9188 sources in the soft and hard bands,
respectively. The sky area covered by the >20 ks ultrahard sample
is ~51.5 deg$^2$ for a total of 1259 sources (see Figure 1). The
properties of the sample are summarised in Table 1.

Fig. 2. Source counts of the real (triangles) and simulated samples
(dots) in the soft (upper panel) and hard (lower panel)
bands.



3. Cosmic variance 

As a preliminary test for source clustering, we have compared
the actual number of sources detected in each field $N_k$ with the
number of expected sources $\lambda_k$ obtained from the best fit log N -
log S of Mateos et al. (2008). This is similar to the traditional
count-in-cells method. If there is a cosmic structure present in a
few (or most of the) fields, the number of sources detected would
be significantly different compared to that of a random uniform
distribution as measured by the overall log N - log S.
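
As a rough illustration of this kind of test, the sketch below compares observed and expected counts per field through cumulative Poisson probabilities, combines them into a single likelihood statistic, and counts how many Poisson realisations of the expected counts give a larger value. It is a minimal sketch and not the code used for the survey: the function names, the treatment of fields where the observed count equals the expectation (taken here as the lower tail), and the simple Poisson sampler are all assumptions made for the example.

```python
import math
import random


def poisson_pmf(l, lam):
    """Poisson probability of l counts when lam are expected (lam > 0)."""
    return math.exp(l * math.log(lam) - lam - math.lgamma(l + 1))


def tail_probability(n, lam):
    """Cumulative Poisson probability: P(>= n) if n > lam, otherwise P(<= n)."""
    if n > lam:
        total, l = 0.0, n
        while True:  # terms decrease monotonically for l > lam
            term = poisson_pmf(l, lam)
            total += term
            if term <= 1e-16 * total:
                break
            l += 1
    else:
        total = sum(poisson_pmf(l, lam) for l in range(n + 1))
    return max(total, 1e-300)  # guard against log(0)


def likelihood(observed, expected):
    """Statistic -2 * sum(ln P) over all fields (larger means a worse match)."""
    return -2.0 * sum(math.log(tail_probability(n, lam))
                      for n, lam in zip(observed, expected))


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam only."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def count_exceeding(observed, expected, n_sim=1000, seed=1):
    """Number of Poisson realisations whose statistic exceeds the observed one."""
    rng = random.Random(seed)
    l_obs = likelihood(observed, expected)
    exceed = 0
    for _ in range(n_sim):
        fake = [sample_poisson(lam, rng) for lam in expected]
        if likelihood(fake, expected) > l_obs:
            exceed += 1
    return exceed
```

A small number of exceeding realisations indicates that the field-to-field scatter of the observed counts is larger than expected from a random uniform distribution.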

For this purpose we have used the cumulative Poisson distributions

$$P_{\lambda_k}(\geq N_k) = \sum_{l=N_k}^{\infty} P_{\lambda_k}(l), \quad N_k > \lambda_k$$
$$P_{\lambda_k}(\leq N_k) = \sum_{l=0}^{N_k} P_{\lambda_k}(l), \quad N_k < \lambda_k, \tag{1}$$

where $P_{\lambda_k}(l)$ is the Poisson probability of detecting $l$ sources
when the expected number of sources is $\lambda_k$. The likelihood statistic
for the whole sample is hence

$$L' = -2 \sum_{k} \ln P_{\lambda_k}(\geq N_k) - 2 \sum_{k'} \ln P_{\lambda_{k'}}(\leq N_{k'}), \tag{2}$$

where $k$ runs over the fields for which $N_k > \lambda_k$ and $k'$ runs over
the fields for which $N_{k'} < \lambda_{k'}$. This procedure is the same as the
one used in Carrera et al. (2007), and differs slightly from that of
Carrera et al. (1998), which uses $P_{\lambda}(l)$ instead of the cumulative
probabilities.

The observed $L'$ values thus obtained were compared to
1000000 simulated likelihood values calculated using $\lambda_k$ and
Poisson statistics, finding that the number of simulations with
likelihood values above the observed ones were 0, 6522 and
703713 for the soft, hard and ultrahard bands, respectively. It
can be seen that there are significant deviations from the
expected number of sources from a random uniform distribution
in the soft and hard bands.

The fact that all the simulations in the soft band show better
likelihoods than our sample indicates that we can set a lower
limit for the significance of the deviation (i.e. evidence for
clustering) of $>6\sigma$, whereas in the hard band the deviation
is of the order of ~3$\sigma$. The small number of simulations in
the ultrahard band with likelihoods better than the observed ones
suggests that no significant deviations (well below 1$\sigma$) from the
random distribution were found, probably due to the large
statistical noise (Stewart et al., in preparation). However, these
results point in the direction that a cosmic large-scale structure
might be present in most of the observations, therefore setting
the ground for a more elaborate clustering analysis.

4. The angular correlation function

The two-point angular correlation function $w(\theta)$ determines the
joint probability of finding two objects in two small angular
regions $\delta\Omega_1$ and $\delta\Omega_2$ separated by an angular distance $\theta$ with
respect to that of a random distribution (Peebles 1980),

$$\delta P = n^2 \delta\Omega_1 \delta\Omega_2 [1 + w(\theta)], \tag{3}$$

where $n$ is the mean sky density of objects. If $w(\theta) = 0$ the
distribution is homogeneous.

The angular separation is a projection in the sky of the actual
spatial separation between two sources at different redshifts, thus
somewhat blurring the true underlying spatial clustering, which
needs accurate and highly complete redshift determinations to
be properly measured. The angular correlation function is,
nevertheless, a powerful approach given the large size of the present
two-dimensional extragalactic sample.

The spatial clustering, however, can be estimated by
deprojecting the computed angular correlation function assuming a
given redshift distribution (which can be either empirically
measured or derived from a luminosity function model) via the
inversion of Limber's equation (see Section 5).

4.1. Method

To calculate the angular correlation function we have used the
estimator proposed by Landy & Szalay (1993):

$$w(\theta_i) = \frac{DD - 2DR + RR}{RR}, \tag{4}$$

where DD, DR and RR are the normalised number of pairs of
sources in the $i$-th angular bin for the Data-Data, Data-Random
and Random-Random samples, respectively. DD, DR and RR are
normalised by dividing them by the total number of pairs in the
sample:


$$\begin{aligned}
f_{DD} &= N_D (N_D - 1)/2 \\
f_{DR} &= N_D N_R \\
f_{RR} &= N_R (N_R - 1)/2.
\end{aligned} \tag{5}$$
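
The estimator of equations (4) and (5) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the pipeline used for the survey: positions are taken as (RA, Dec) pairs in degrees, `N_D` and `N_R` are read as the numbers of sources in the data and random catalogues, pairs are counted by brute force, and all function names are hypothetical.

```python
import math


def separation_arcsec(p, q):
    """Angular separation of two (ra, dec) positions in degrees (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (*p, *q))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def pair_counts(a, b, edges, cross):
    """Raw pair counts per bin; cross=False counts each pair within `a` once."""
    counts = [0] * (len(edges) - 1)
    for i, p in enumerate(a):
        others = b if cross else a[i + 1:]
        for q in others:
            s = separation_arcsec(p, q)
            for j in range(len(counts)):
                if edges[j] <= s < edges[j + 1]:
                    counts[j] += 1
                    break
    return counts


def landy_szalay(data, rand, edges):
    """w(theta_i) = (DD - 2 DR + RR) / RR with normalised pair counts."""
    n_d, n_r = len(data), len(rand)
    f_dd = n_d * (n_d - 1) / 2
    f_dr = n_d * n_r
    f_rr = n_r * (n_r - 1) / 2
    dd = pair_counts(data, data, edges, cross=False)
    dr = pair_counts(data, rand, edges, cross=True)
    rr = pair_counts(rand, rand, edges, cross=False)
    w = []
    for c_dd, c_dr, c_rr in zip(dd, dr, rr):
        if c_rr == 0:
            w.append(float("nan"))  # bin not sampled by the random catalogue
        else:
            w.append((c_dd / f_dd - 2 * c_dr / f_dr + c_rr / f_rr) / (c_rr / f_rr))
    return w
```

In practice the random catalogue would be much larger than the data catalogue and a faster pair counter would be needed; the brute-force loops are kept only for readability.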






Fig. 3. Angular correlation function in the soft (Top panels), hard (Centre panels) and ultrahard (Bottom panels) bands. Left hand
panels are represented in linear-log scales, while right hand panels are represented in log-log scales. Solid dots are the observed
data. Solid triangles represent the average random estimation of $w(\theta)$ used to calculate the integral constraint (see text). Overplotted
is the best $\chi^2$ fit with and without fixed slope (dashed and solid lines, respectively).



To produce the random source sample against which we have 
compared the real source sample searching for overdensities at 
different angular distances, we have tried to mimic as closely 
as possible the real distribution of the detection sensitivity of 
the survey. The method employed here is similar to that used in 
Carrera et al. (2007) and can be summarised as follows:

1. We formed a pool that included all the real sources with
detection likelihoods >15 in the band under study, irrespective
of whether their count rates were above the sensitivity map
of their corresponding field at the source position or not.

2. For each field, we extracted sources at random from this pool
keeping their original count rates and distances to the optical
axis of the X-ray telescope but randomizing their azimuthal
angle around it.

3. If the source had a count rate above the sensitivity map of the
field under consideration at the new position, it was kept in
the random sample. Otherwise, the source was discarded and
a new source was drawn from the pool until the number of
valid simulated sources matched the number of real detected
sources in the field.

This method allowed us to reproduce the decline of the
detection sensitivity with the off-axis angle in the simulated
sample. We have performed 100 simulations of the whole sample in
each band this way. For comparison, we have plotted in Figure 2
the log N - log S relations for the real and random samples in the
soft and hard bands, respectively. Both curves match very well
along the entire flux range except at fluxes close to the flux limit
of the survey, where the source counts of the simulated sample
slightly underestimate the source counts of the real sample.
However, the overall agreement is excellent and therefore
the above method provides a good random sample for clustering
analysis purposes.

The errors in different angular bins are not independent from
one another. To estimate the errors we have followed the method
described in Miyaji et al. (2007), who computed the covariance
matrix in the form

$$M_{ij} = \sum_{k=1}^{N_{\rm sim}} \left[ w_R^k(\theta_i) - \langle w_R(\theta_i) \rangle \right] \left[ w_R^k(\theta_j) - \langle w_R(\theta_j) \rangle \right] / N_{\rm sim} \times [1 + w(\theta_i)]^{1/2} [1 + w(\theta_j)]^{1/2}, \tag{6}$$

where $w_R^k(\theta_i)$ is the angular correlation function for the $k$-th
simulation in the $i$-th angular bin, $\langle w_R(\theta_i) \rangle$ is their mean value, $w(\theta_i)$
is the actual value for the angular correlation function computed
from equation 4 and $N_{\rm sim}$ is the total number of simulations. The
square roots of the diagonal elements $M_{ii}$ are hence the scaled
errors for each bin in the angular correlation function.

Table 2. Summary of the fits.

Band (keV)           N_sou   $\theta_0$ (arcsec)   $\gamma - 1$             $\chi^2$/d.o.f.
Soft (0.5-2)         31288   22.9 ± 2.0            1.12 ± 0.04              8.4
                             7.7 ± 0.1             0.8 (fixed)              12.3
Hard (2-10)          9188    29.2 (†)              1.33 (†)                 1.9
                             5.9 ± 0.3             0.8 (fixed)              3.0
Ultrahard (4.5-10)   1259    ...                   $1.47^{+0.43}_{-0.57}$   0.5
                             7.4 ± 1.4 (‡)         0.8 (fixed)              0.6

(†) Error bars not legible in the source.
(‡) Only one $\theta_0$ value is legible for this band; its row assignment is uncertain.
... Value not legible in the source.

4.2. Fit to an analytical model

The angular correlation function calculated in Section 4.1 can be
described by a power-law model in the form

$$w_{\rm model}(\theta) = \left( \frac{\theta}{\theta_0} \right)^{1-\gamma}, \tag{7}$$

where $1 - \gamma$ is the slope and $\theta_0$ is the angular correlation length.

We have fitted the data using a $\chi^2$ technique. The fits were
carried out over the range in which we had positive signal
(50-1000 arcsec). As in Miyaji et al. (2007), in order to
take into account the correlations between errors computed in
Section 4.1 we have minimised the expression

$$\chi^2 = \Delta^T M^{-1} \Delta, \tag{8}$$

where $M^{-1}$ is the inverse of the covariance matrix (equation 6),
and $\Delta$ is a vector such as

$$\Delta_i = w(\theta_i) - w_{\rm model}(\theta_i) + IC. \tag{9}$$

IC is a constant that accounts for the integral constraint, which
is a bias in the correlation function that occurs when a positive
correlation is present at angular scales comparable to the
individual field size. The mean surface density of objects from the
survey is therefore too high, thus producing a negative bias in the
angular correlation function (Basilakos et al. 2004). The integral
constraint can be formally calculated by

$$IC = \frac{\iint d\Omega_1 \, d\Omega_2 \, w(\theta)}{\iint d\Omega_1 \, d\Omega_2}, \tag{10}$$

where the integrals are carried out over the whole area of the
survey. However, since the dependence of the sensitivity on the
area of the survey is rather complicated we have estimated IC
empirically. For this, we have calculated the angular correlation
function in the absence of correlation via the average of $N_{\rm sim}$
realisations of $w(\theta)$ in which we have replaced the real data by
random samples simulated independently using the method
described in Section 4.1 (Carrera et al. 2007). The values of IC
thus obtained are small (~$9.5 \times 10^{-3}$) but not negligible, and
ignoring them can produce underestimations in the strength of
the clustering signal.
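
The fitting procedure of this section can be illustrated with a short sketch: a scaled covariance matrix built from random realisations, a power-law model, and a $\chi^2$ that uses the full covariance matrix and the integral constraint. This is a minimal example under stated assumptions rather than the code used for the paper: IC is passed in as a known constant, the minimisation is a brute-force grid search, the linear system is solved with plain Gaussian elimination, and all function names are hypothetical.

```python
import math


def power_law(theta, theta0, gamma):
    """Model w(theta) = (theta / theta0)^(1 - gamma)."""
    return (theta / theta0) ** (1.0 - gamma)


def scaled_covariance(w_sims, w_obs):
    """Covariance of the random realisations, scaled by [1 + w]^(1/2) per bin.

    Assumes w_obs > -1 in every bin.
    """
    n_sim, n_bin = len(w_sims), len(w_obs)
    mean = [sum(sim[i] for sim in w_sims) / n_sim for i in range(n_bin)]
    cov = [[0.0] * n_bin for _ in range(n_bin)]
    for i in range(n_bin):
        for j in range(n_bin):
            s = sum((sim[i] - mean[i]) * (sim[j] - mean[j]) for sim in w_sims) / n_sim
            cov[i][j] = s * math.sqrt(1 + w_obs[i]) * math.sqrt(1 + w_obs[j])
    return cov


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][c] * x[c] for c in range(i + 1, n))) / a[i][i]
    return x


def chi_square(thetas, w_obs, cov, theta0, gamma, ic):
    """chi^2 = Delta^T M^-1 Delta with Delta_i = w_i - w_model(theta_i) + IC."""
    delta = [w - power_law(t, theta0, gamma) + ic for t, w in zip(thetas, w_obs)]
    return sum(d * y for d, y in zip(delta, solve(cov, delta)))


def grid_fit(thetas, w_obs, cov, ic, theta0_grid, gamma_grid):
    """Return (chi2, theta0, gamma) of the best grid point."""
    return min((chi_square(thetas, w_obs, cov, t0, g, ic), t0, g)
               for t0 in theta0_grid for g in gamma_grid)
```

Fixing the slope, as done for comparison with other works, corresponds to passing a single value in `gamma_grid` (for example `[1.8]`).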

4.3. Results 

In this section we present the results obtained after fitting the
overall samples to the power-law model described in Section 4.2
in each energy band. The best-fit parameters are summarised in
Table 2 along with the number of sources involved. These values
correspond to the fits in the 50-1000 arcsec range applying
the integral constraint correction. The reported errors are 1$\sigma$. Fit
results are plotted along with the observed binned angular
correlation function in Figure 3.

In the soft band (0.5-2 keV) we detect a high-significance
(~10$\sigma$) clustering signal with a correlation length of $\theta_0 = 22.9 \pm 2.0$
arcsec and a slope of $\gamma - 1 = 1.12 \pm 0.04$ after correcting for
the integral constraint. If we ignore that correction in our fits, the
correlation lengths are comparable within the error bars but the
power law becomes significantly steeper ($\gamma - 1 = 1.29 \pm 0.04$).
Similar values for the slope $\gamma - 1$ in this band were found by
Gandhi et al. (2006) and Carrera et al. (2007), who obtained
$\gamma - 1 = 1.2$ albeit with larger (up to a factor 10) error bars. Our
best-fit correlation length $\theta_0$ is consistent with that of Carrera et
al. (2007) ($\theta_0 = 19$ arcsec) within the error bars, but it is
significantly larger than that of Gandhi et al. (2006) ($\theta_0 = 6.3 \pm 3.0$
arcsec). In general, our results without applying the integral
constraint correction show slightly larger correlation lengths and
steeper slopes, although generally consistent with the IC
corrected results within the error bars.

For comparison with other works that reported their results
with a fixed $\gamma - 1$ we have also made the fits fixing the slope to
the canonical value of $\gamma - 1 = 0.8$ (e.g. Peebles 1980). The
correlation length then drops to $\theta_0 = 7.7 \pm 0.1$ arcsec, which is in good
agreement with Puccetti et al. (2006) ($\theta_0 = 5.2 \pm 3.8$ arcsec) and
Carrera et al. (2007) ($\theta_0 = 6 \pm 2$ arcsec), although our parameters
are much better constrained thanks to the large statistics. Our
result is smaller (by 2$\sigma$) than that computed by Basilakos et al.
(2005) using the XMM-Newton/2dF survey ($\theta_0 = 10.4 \pm 1.9$ arcsec).
On the other hand, Ueda et al. (2008) found a correlation
length of $\theta_0 = 5.9$ arcsec in the SXDS survey, smaller
than ours by ~2$\sigma$, while Miyaji et al. (2007) obtained a much
weaker clustering strength of $\theta_0 = 3.1 \pm 0.5$ arcsec for sources
in the COSMOS field. Similarly, Yang et al. (2003) found a
~2$\sigma$ clustering signal in Chandra observations of the Lockman
Hole North-West region with $\theta_0 = 4 \pm 2$ arcsec. The samples
used by Yang et al. (2003) and Miyaji et al. (2007) reach limiting
fluxes one order of magnitude deeper than that of our sample.
Therefore, their very low correlation lengths compared with
ours could be explained in terms of a dependence of the
clustering strength on the flux limit of the sample involved (see
Section 4.4).

Fig. 4. Best-fit correlation length (for $\gamma - 1 = 0.8$) as a function of the flux limit of the sample (at zero area) in the soft (Left panel)
and hard (Right panel) bands. Overplotted is the best fit to all points (solid line), our sample only (dashed line), and the rest of the
surveys excluding our data points (dot-dashed line). The points belonging to our sample are not independent from each other (i.e.
the sources are used cumulatively above the given flux limit).

Table 3. Summary of the fits in different flux-limited subsamples.

Band (keV)     Flux limit (erg cm$^{-2}$ s$^{-1}$)   N_sou   $\theta_0$ (arcsec)   $\gamma - 1$
Soft (0.5-2)   $5 \times 10^{-15}$                   23264   27.3 (†)              1.16 (†)
                                                             8.7 ± 0.1             0.8 (fixed)
Soft (0.5-2)   $1 \times 10^{-14}$                   12046   ...                   ...
                                                             9.9 ± 0.2             0.8 (fixed)
Soft (0.5-2)   $4 \times 10^{-14}$                   1346    87.8 (†)              ...
                                                             7.6 ± 1.5             0.8 (fixed)
Hard (2-10)    $3 \times 10^{-14}$                   4790    23.5 (†)              ...
                                                             8.3 (†)               0.8 (fixed)
Hard (2-10)    $6 \times 10^{-14}$                   1571    ...                   0.8 (fixed)
Hard (2-10)    $9 \times 10^{-14}$                   609     19.2 (†)              0.8 (fixed)

(†) Error bars not legible in the source.
... Value not legible in the source.

In the hard band (2-10 keV) the power law becomes steeper
($\gamma - 1 = 1.33$) and the clustering is stronger ($\theta_0 = 29.2$ arcsec),
although marginally consistent with the results in the soft band
within the 1$\sigma$ error bars. The clustering detection is still very
significant (~5$\sigma$). The sources detected in the hard band are less
biased against absorption, and if the unified model of AGN is
correct (the obscuration of the nucleus is due to orientation effects
only) one might not expect significant differences in the
clustering properties of obscured and unobscured sources. However,
accurate angular clustering measurements in this band have been
difficult because of the limitations caused by the small-number
statistics. For instance, Gandhi et al. (2006), Carrera et al. (2007)
and Ueda et al. (2008) were not able to obtain a significant
clustering signal in this band. Basilakos et al. (2004) found results
consistent with ours ($\gamma - 1 = 1.2 \pm 0.3$ and $\theta_0 = 48.9$ arcsec)
but with much larger error bars. Puccetti et al. (2006) are also in
agreement with us ($\theta_0 = 12.8 \pm 7.8$ arcsec) within their large
uncertainties. For a canonical slope of 0.8, Yang et al. (2003) found
$\theta_0 = 40 \pm 11$ arcsec, much larger than ours ($\theta_0 = 5.9 \pm 0.3$ arcsec),
with a significance of ~4$\sigma$. Such a strong clustering is somewhat
surprising and it is much higher than any other reported angular
clustering analysis in this band using a fixed canonical slope
(although it is roughly consistent with the result of Basilakos et al.
2004 for a fixed $\gamma - 1 = 0.8$). We would like to stress, however,
that the characterisation of the angular correlation function
using a fixed slope might not represent the true clustering of the
X-ray sources (see the differences between both best-fit curves
in Figure 3). Indeed, the goodness of fit in both the soft and hard
bands shows that when we allow the slope to float free, the fit to
our measured $w(\theta)$ is better than when it is fixed to the
canonical value of $\gamma = 1.8$ (see Table 2). However, we will make use
of our fixed-slope fit results elsewhere in this paper in order to
make comparisons with other works that only reported $\gamma = 1.8$
results due to the low statistics.

We have also performed the angular clustering analysis in the
ultrahard band (4.5-10 keV) obtaining inconclusive results, even
after using only deeper fields (see Section 2). We found a marginal
clustering signal (~1$\sigma$) in this band, with the best-fit parameters
consistent with those of the soft and hard bands (see Table 2)
and in good agreement with the results of Miyaji et al. (2007).

4.4. Dependence on the flux limit 

Previous studies on the angular clustering of Chandra Deep Field
sources using different subsamples suggest that the clustering
strength might depend on the flux limit of the sample (Giacconi
et al. 2001, Plionis et al. 2008). Similar results with other
samples with different depths also seem to point in this direction.
We have investigated this behavior in our soft and hard samples
(where we have enough statistics) by splitting both samples into
subsamples with different flux limits and then computing their
angular correlation function. This means that we are using all
sources cumulatively above the given flux limit. The results of
these fits are reported in Table 3. Errors are 1$\sigma$.

Fig. 5. Best-fit correlation length (for $\gamma - 1 = 0.8$) as a function
of the flux limit of the sample in the soft (dots) and hard
(triangles) bands in different flux bins. The individual points are
independent from each other. Overplotted is the soft (solid line)
and hard (dashed line) best fits.

We can see that if we leave both parameters $\theta_0$ and $\gamma - 1$
free, the clustering strength significantly increases and the power 
law becomes steeper as we move towards brighter flux limits in 
the soft band. Something similar is observed in the hard band, 
although in this case we were unable to simultaneously fit both 
parameters in the brightest subsamples due to the low source 
density. 

In order to compare these results with other works we froze
$\gamma - 1 = 0.8$. The resulting best-fit correlation length $\theta_0$ still shows
a strong dependence on the flux limit of the subsample. In
Figure 4 we have plotted these points along with the results from
other surveys such as CDF-N and CDF-S (Plionis et al. 2008),
CLASXS (Yang et al. 2003), COSMOS (Miyaji et al. 2007),
XMM-2dF (Basilakos et al. 2004, 2005), ELAIS-S1 (Puccetti
et al. 2006), XMM-LSS (Gandhi et al. 2006) and AXIS (Carrera
et al. 2007). In Figure 4 there seems to be a linear $\theta_0$-flux trend
in spite of the large uncertainties in most of the surveys.

We tried to formalize the observed trend by fitting the points
to a power law in the form $\theta_0 = T (S/S_{15})^a$, where $S_{15} =
1 \times 10^{-15}$ erg cm$^{-2}$ s$^{-1}$ and the normalisation $T$ is the expected
correlation length at $S_{15}$. We find that $a = 0.15 \pm 0.01$ and
$T = 6.98 \pm 0.10$ arcsec in the soft band, and $a = 0.34 \pm 0.04$
and $T = 2.72$ arcsec in the hard band, respectively.
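
As a worked example of this relation, the short sketch below evaluates $\theta_0 = T (S/S_{15})^a$ with the best-fit values quoted above for the fits to all points. The function and dictionary names are chosen for the example only, and the quoted uncertainties are not propagated.

```python
S15 = 1e-15  # reference flux in erg cm^-2 s^-1, as defined in the text

# (T in arcsec, a) from the fits to all points quoted in the text
FITS = {"soft": (6.98, 0.15), "hard": (2.72, 0.34)}


def correlation_length(flux_limit, band):
    """Expected correlation length (arcsec) at a given flux limit."""
    norm, index = FITS[band]
    return norm * (flux_limit / S15) ** index


for band in ("soft", "hard"):
    print(band, round(correlation_length(1e-14, band), 2))
```

For a flux limit of $1 \times 10^{-14}$ erg cm$^{-2}$ s$^{-1}$ the soft-band relation gives about 9.9 arcsec, close to the fixed-slope value listed in Table 3 for that subsample.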

Although our points match well with the results from other
surveys within the error bars of the latter, they seem to follow
a slightly different trend. We therefore fitted our points alone
obtaining $a = 0.11 \pm 0.01$ and $T = 7.37$ arcsec, and
$a = 0.35$ and $T = 2.59$ arcsec in the soft and hard
bands, respectively. Fitting the data points of the rest of the surveys
and excluding ours provides the following best-fit parameters:
$a = 0.44$ and $T = 5.55 \pm 0.39$ arcsec in the soft band, and
$a = 0.39$ and $T = 2.99$ arcsec in the hard band.

There are no significant differences between the fits in the
hard band, all of them consistent with each other. In the soft
band, however, strong differences arise, our data points
suggesting a much milder dependence on the flux limit than that
predicted by other surveys. Although our data points do not look
visually misplaced with respect to the general trend (with the
exception of the last point at the highest flux limit, which could
be severely affected by the low statistics), the fits in which they
are involved are dominated by them due to their much smaller
error bars. We have also checked that the above results do not
significantly change by removing the point at the highest flux
limit, the fits still being dominated by the tiny error bars of the rest
of the points. On the other hand, the Chandra Deep Field data
points alone suggest a stronger dependence in both energy bands
with respect to the one we would expect from the other surveys,
which might be caused by cosmic variance or the so-called
amplification bias (caused by the merging of a pair of sources into
a single source due to the PSF of the detector at small angular
separations, Vikhlinin & Forman 1995) as discussed in Gilli et
al. (2005) and Plionis et al. (2008).

We have also checked whether the flux limit dependence is
still significant using independent flux bins. Our data points
plotted in Figure 4 are not independent from each other, since the
subsamples are cumulative: the sources above a given flux limit
also belong to every subsample with a fainter flux limit.
Therefore, we have studied the flux dependence using independent
data points. As shown in Figure 5, the trend is also very clear in
this case, being much steeper in the hard band than in the soft
band (a = 0.67 ± 0.09 against a = 0.26 ± 0.02), thus suggesting
that the clustering strength of the sources detected in the
2-10 keV band depends more strongly on the flux limit of the
subsample.

There are several possible explanations for the $\theta_0$-flux
limit trend. One of them could be that, since X-ray surveys at
different flux limits are, in principle, sampling different
populations of sources, they have different clustering properties.
Deep pencil-beam surveys will probe fainter and typically more
distant sources than wide-area bright surveys, thus suggesting a
possible dependence of clustering on redshift. Another simple
explanation could be that pairs of sources in deep surveys (hence
with fainter flux limits) tend to be separated by smaller angular
distances due to a projection effect. If this is the actual cause
of the trend, the dependence should disappear after taking into
account the real separation between sources (i.e. redshifts).

4.5. Angular clustering of a hardness ratio selected sample 

If the unified model of AGN is correct and the obscuration is
due to an orientation effect, the clustering of obscured and
unobscured sources should not be significantly different. The
expected fraction of obscured AGN in our sample is ~50% in the
2-10 keV band, and ~20% in the 0.5-2 keV band (Mateos et al.
2008), and they can be selected by computing their hardness
ratios (HR)

$$ \mathrm{HR} = \frac{H - S}{H + S}, $$

where S and H are the count rates of the source in the soft and
hard bands, respectively. An HR value often used in the literature
to discriminate between obscured and unobscured sources is
HR = -0.2, which approximately corresponds to a source with an
obscuring column density $N_H = 10^{22}$ cm$^{-2}$ and an
intrinsic power-law photon index of 1.7 at redshift z = 0.7 (see
e.g. Gandhi et al. 2004).
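As an illustration of this selection, the following minimal sketch computes the hardness ratio and applies the HR = -0.2 cut. It assumes the conventional definition HR = (H - S)/(H + S) in terms of the soft and hard count rates; the function names and the example count rates are hypothetical.

```python
HR_THRESHOLD = -0.2  # cut used in the text to separate the two subsamples


def hardness_ratio(soft_rate, hard_rate):
    """HR = (H - S) / (H + S), with S and H the soft and hard count rates."""
    total = hard_rate + soft_rate
    if total <= 0:
        raise ValueError("total count rate must be positive")
    return (hard_rate - soft_rate) / total


def is_likely_obscured(soft_rate, hard_rate, threshold=HR_THRESHOLD):
    """True when the source falls in the HR > threshold subsample."""
    return hardness_ratio(soft_rate, hard_rate) > threshold


# Hypothetical count rates (counts per second) for two sources
soft_dominated = hardness_ratio(3.0, 1.0)      # S much larger than H
equal_rates = is_likely_obscured(1.0, 1.0)     # HR = 0, above the cut
```

By construction HR lies between -1 (no hard counts) and +1 (no soft counts), so a cut at -0.2 keeps moderately soft sources in the unobscured subsample.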

We have applied this criterion to our sources, thus creating
two subsamples with HR > -0.2 and HR < -0.2 in the soft and
hard bands, and studied their angular clustering properties. The
best-fit parameters are reported in Table 4 along with their
1$\sigma$ uncertainties. The results are plotted in Figure 6.

Fig. 6. Angular correlation function for the soft HR > -0.2 (upper left panel), soft HR < -0.2 (upper right panel), hard HR > -0.2
(lower left panel), and hard HR < -0.2 (lower right panel) sources. Solid dots are the observed data. Solid triangles represent the
average random estimation of $w(\theta)$ used to calculate the integral constraint (see text). Overplotted is the best $\chi^2$ fit with and without
fixed slope (dashed and solid lines, respectively).



The results show that the clustering properties of HR > -0.2
sources, which are likely to be absorbed, are not significantly
different from those of the HR < -0.2 sources (well within the
error bars) in both the soft and hard bands. This is in
contradiction with the results of Gandhi et al. (2006), who found
that sources in the XMM-Newton-LSS survey with HR > -0.2 in
the 2-10 keV band showed evidence of clustering whereas sources
with HR < -0.2 did not. Their results were strongly affected by
low-count statistics, however, and their angular clustering
detections were marginal.



On the other hand, our results are in agreement with those of
Gilli et al. (2005) and (2009), who computed the spatial
correlation function for obscured and unobscured sources in the
CDF and COSMOS fields, respectively. They were unable to find
significant evidence of different clustering behaviours between
the two subsets of sources. Moreover, since obscured AGN are
typically detected at low redshifts (z < 1) in medium-depth
surveys, this could also mean that sources above and below
redshift 1, when AGN reached a maximum in comoving density in the
Universe according to recent luminosity function models, do not
cluster differently.



Table 4. Summary of the fits of the hardness-ratio selected
subsamples.

Band (keV)                  N_sou    $\theta_0$ (arcsec)     $\gamma - 1$
Soft (0.5-2), HR > -0.2      4038    … (+14.5, -17.7)        1.17 (+0.26, -0.35)
                                     8.3 ± 0.5               0.8 (fixed)
Soft (0.5-2), HR < -0.2     27791    …                       1.17 ± 0.05
                                     7.5 ± 0.1               0.8 (fixed)
Hard (2-10), HR > -0.2       3425    …                       … (+0.28, -0.36)
                                     6.8 ± 0.5               0.8 (fixed)
Hard (2-10), HR < -0.2       6006    …                       … (+0.17, -0.20)
                                     6.2 ± 0.3               0.8 (fixed)



5. Spatial clustering 

5.1. Inversion of Limber's equation 

The two-dimensional angular correlation function is a projection
on the sky of the real three-dimensional spatial correlation
function $\xi(r)$ along the line of sight, where r is the physical
separation between sources (typically in units of $h^{-1}$ Mpc).

Fig. 7. The redshift selection function for the soft (solid line),
hard (dashed line) and ultrahard (dot-dashed line) bands.



We can model the spatial correlation function as (De Zotti et
al. 1990)

$$ \xi(r,z) = \left(\frac{r}{r_0}\right)^{-\gamma} (1+z)^{-(3+\epsilon)}, \qquad (12) $$

where $\epsilon$ parameterizes the type of clustering evolution. For
instance, if $\epsilon = \gamma - 3$, the clustering is constant in
comoving coordinates, which means that the amplitude of the
correlation function remains fixed with redshift in comoving
coordinates as the pair of sources expands together with the
Universe. On the other hand, if $\epsilon = -3$ the clustering is
constant in physical coordinates (De Zotti et al. 1990).
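The meaning of the two values of the evolution parameter can be checked numerically. The sketch below evaluates the power-law model $\xi(r,z) = (r/r_0)^{-\gamma}(1+z)^{-(3+\epsilon)}$ with r the physical separation; the function name and the numerical values of $r_0$, $\gamma$ and the separations are chosen for illustration only.

```python
import math


def xi(r_phys, z, r0, gamma, eps):
    """Power-law spatial correlation function with clustering evolution.

    r_phys and r0 are physical separations in the same units.
    """
    return (r_phys / r0) ** (-gamma) * (1.0 + z) ** (-(3.0 + eps))


r0, gamma = 12.25, 2.12      # illustrative values
r_comoving = 8.0             # fixed comoving separation

# eps = gamma - 3: the amplitude at a fixed COMOVING separation
# (physical separation r_comoving / (1 + z)) does not change with z.
eps_com = gamma - 3.0
xi_com_z0 = xi(r_comoving, 0.0, r0, gamma, eps_com)
xi_com_z1 = xi(r_comoving / 2.0, 1.0, r0, gamma, eps_com)

# eps = -3: the amplitude at a fixed PHYSICAL separation does not change.
xi_phys_z0 = xi(5.0, 0.0, r0, gamma, -3.0)
xi_phys_z1 = xi(5.0, 1.0, r0, gamma, -3.0)
```

The first pair of values coincides because the factor $(1+z)^{\gamma}$ gained by shrinking the physical separation is exactly cancelled by $(1+z)^{-(3+\epsilon)}$ when $\epsilon = \gamma - 3$.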

The angular amplitude $\theta_0$ can be related to the spatial
amplitude $r_0$ by inverting Limber's integral equation (Peebles
1993). In the case of a spatially flat Universe, Limber's equation
can be expressed as (Basilakos et al. 2005)

$$ w(\theta) = 2\, \frac{\int_0^\infty \int_0^\infty D_c^4\, \phi^2(D_c)\, \xi(r,z)\, dD_c\, du}{\left[\int_0^\infty D_c^2\, \phi(D_c)\, dD_c\right]^2}, \qquad (13) $$

where $\phi(D_c)$ is the selection function (the probability that a
source at a comoving distance $D_c$ is detected in the survey). The
comoving distance $D_c$ is related to the redshift through (Hogg
1999)

$$ D_c(z) = \frac{c}{H_0} \int_0^z \frac{dz'}{E(z')}, \qquad (14) $$

with

$$ E(z) = \left[\Omega_m (1+z)^3 + \Omega_\Lambda\right]^{1/2} \qquad (15) $$

in a spatially flat Universe ($\Omega_k = 0$), and $H_0$ being the
Hubble constant (Peebles 1993; Hogg 1999).
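The comoving distance of a flat Universe, $D_c(z) = (c/H_0)\int_0^z dz'/E(z')$ with $E(z) = [\Omega_m(1+z)^3 + \Omega_\Lambda]^{1/2}$, has no simple closed form and is usually integrated numerically. The following sketch does so with a composite Simpson rule. The cosmological parameters ($\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$) are an assumption made for this example, and distances are returned in $h^{-1}$ Mpc by taking $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$.

```python
import math

C_OVER_H0 = 2997.92458  # c / H0 in h^-1 Mpc, for H0 = 100 h km/s/Mpc


def E(z, omega_m=0.3, omega_l=0.7):
    """Dimensionless Hubble parameter for a flat Universe (assumed values)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] using an even number of intervals."""
    n += n % 2
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def comoving_distance(z, omega_m=0.3, omega_l=0.7):
    """Line-of-sight comoving distance in h^-1 Mpc."""
    return C_OVER_H0 * simpson(lambda zp: 1.0 / E(zp, omega_m, omega_l), 0.0, z)


d_z1 = comoving_distance(1.0)  # roughly 2.3e3 h^-1 Mpc for this cosmology
```

A library routine (for instance an adaptive quadrature) could replace the hand-written Simpson rule; it is spelled out here only to keep the example free of external dependencies.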

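The number of objects in a survey that subtends a solid angle
$\Omega_s$ in the sky within a redshift shell dz is

$$ \frac{dN}{dz} = \Omega_s\, D_c^2\, \phi(D_c) \left(\frac{c}{H_0}\right) E^{-1}(z). \qquad (16) $$

Combining these equations, Limber's equation becomes

$$ w(\theta) = 2\, \frac{H_0}{c} \int_0^\infty \left(\frac{1}{N}\frac{dN}{dz}\right)^2 E(z)\, dz \int_0^\infty \xi(r,z)\, du. \qquad (17) $$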

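The physical separation between two sources that subtend an
angle $\theta$ in the sky can be written as

$$ r \simeq \frac{1}{1+z} \left(u^2 + D_c^2\, \theta^2\right)^{1/2} \qquad (18) $$

under the small-angle approximation. If we now combine
equations 12 and 17 we obtain

$$ \theta_0^{\gamma-1} = H_\gamma\, r_0^{\gamma}\, \frac{H_0}{c} \int_0^\infty \left(\frac{1}{N}\frac{dN}{dz}\right)^2 \frac{E(z)\,(1+z)^{-3-\epsilon+\gamma}}{D_c^{\gamma-1}(z)}\, dz, \qquad (19) $$

where

$$ H_\gamma = \frac{\Gamma\left(\frac{1}{2}\right)\, \Gamma\left(\frac{\gamma-1}{2}\right)}{\Gamma\left(\frac{\gamma}{2}\right)}, \qquad (20) $$

with $\Gamma$ being the gamma function.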
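The gamma-function factor $H_\gamma = \Gamma(1/2)\,\Gamma[(\gamma-1)/2]/\Gamma(\gamma/2)$ is straightforward to evaluate. The short sketch below computes it with the standard-library gamma function for the canonical slope and for the free-slope values of $\gamma$ used in this section; the function name is hypothetical.

```python
import math


def h_gamma(gamma):
    """H_gamma = Gamma(1/2) * Gamma((gamma - 1)/2) / Gamma(gamma/2)."""
    if gamma <= 1.0:
        raise ValueError("gamma must be larger than 1")
    return (math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
            / math.gamma(gamma / 2.0))


h_canonical = h_gamma(1.8)    # canonical slope
h_soft = h_gamma(2.12)        # soft-band free slope
h_hard = h_gamma(2.33)        # hard-band free slope
```

The factor decreases as the slope steepens, because $\Gamma[(\gamma-1)/2]$ diverges when $\gamma$ approaches 1; this is why the function rejects $\gamma \le 1$.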

We can now invert equation 19 to obtain $r_0$ given an angular
amplitude $\theta_0$, but we also need to determine the source
redshift distribution dN/dz, which can be estimated from a given
luminosity function model. We can do this via equation 16, writing
the selection function (the degradation of sampling as a function
of distance in flux-limited surveys) as

$$ \phi(D_c) = \int_{L_{\min}(z)}^{\infty} \Phi(L_X, z)\, dL_X, \qquad (21) $$



which depends on the evolution of the source luminosity
function $\Phi(L_X, z)$ but is independent of the cosmological model.

We have determined the redshift selection function (see
Figure 7) for our sample using the best-fit luminosity-dependent
density evolution (LDDE) model of the X-ray luminosity function
of Ebrero et al. (2009) for all three bands (soft, hard and
ultrahard). Our results on the inversion of Limber's equation for
different values of the clustering evolution parameter $\epsilon$ are
listed in Table 5 along with the median of the redshift
distribution in each band. The angular clustering parameters
$\theta_0$ and $\gamma$ used are those of Table 2, corrected for the
integral constraint. The 1$\sigma$ errors of $r_0$ have been computed
assuming fixed $\gamma$ and $\epsilon$.
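The inversion itself can be sketched as follows. The code solves $\theta_0^{\gamma-1} = H_\gamma\, r_0^{\gamma}\,(H_0/c)\int (N^{-1}dN/dz)^2\, E(z)\,(1+z)^{\gamma-3-\epsilon}\, D_c^{1-\gamma}(z)\, dz$ for $r_0$. Several ingredients are assumptions made only for this example: the cosmology ($\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$), the integration limits, and above all the redshift distribution `toy_dndz`, which is a hypothetical analytic shape and not the LDDE-based selection function used in this work. The numbers it returns are therefore not expected to reproduce Table 5.

```python
import math

C_OVER_H0 = 2997.92458               # c/H0 in h^-1 Mpc, H0 = 100 h km/s/Mpc
ARCSEC = math.pi / (180.0 * 3600.0)  # one arcsecond in radians


def E(z, omega_m=0.3, omega_l=0.7):
    """Dimensionless Hubble parameter, flat Universe (assumed cosmology)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] using an even number of intervals."""
    n += n % 2
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def comoving_distance(z):
    """Comoving distance in h^-1 Mpc."""
    return C_OVER_H0 * simpson(lambda zp: 1.0 / E(zp), 0.0, z)


def h_gamma(gamma):
    return (math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
            / math.gamma(gamma / 2.0))


def toy_dndz(z, z0=0.7):
    """Hypothetical unnormalised redshift distribution (NOT the LDDE model)."""
    return z ** 2 * math.exp(-((z / z0) ** 1.5))


def limber_integral(gamma, eps, dndz=toy_dndz, zmin=1e-3, zmax=5.0):
    """Redshift integral of the Limber relation, without the H0/c factor."""
    norm = simpson(dndz, zmin, zmax)  # plays the role of N

    def integrand(z):
        p = dndz(z) / norm            # (1/N) dN/dz
        return (p ** 2 * E(z) * (1.0 + z) ** (gamma - 3.0 - eps)
                * comoving_distance(z) ** (1.0 - gamma))

    return simpson(integrand, zmin, zmax)


def r0_from_theta0(theta0_arcsec, gamma, eps, dndz=toy_dndz):
    """Spatial correlation length in h^-1 Mpc from theta0 in arcsec."""
    theta0 = theta0_arcsec * ARCSEC
    rhs_per_r0 = h_gamma(gamma) * limber_integral(gamma, eps, dndz) / C_OVER_H0
    return (theta0 ** (gamma - 1.0) / rhs_per_r0) ** (1.0 / gamma)


def theta0_from_r0(r0, gamma, eps, dndz=toy_dndz):
    """Forward relation: theta0 in arcsec from r0 in h^-1 Mpc."""
    rhs = (h_gamma(gamma) * r0 ** gamma
           * limber_integral(gamma, eps, dndz) / C_OVER_H0)
    return rhs ** (1.0 / (gamma - 1.0)) / ARCSEC
```

For a fixed $\theta_0$ and $\gamma$, the $\epsilon = -3$ model returns a smaller $r_0$ than the $\epsilon = \gamma - 3$ model, since the extra factor $(1+z)^{\gamma}$ enlarges the redshift integral; this is the same ordering seen in Table 5.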

Although it is a useful tool to estimate the spatial clustering,
the inversion of Limber's equation has some limitations since a
number of assumptions have to be made. The spatial correlation
length $r_0$ obtained in this way is therefore affected by
uncertainties coming from the determination of $\theta_0$ and
$\gamma$, the type of clustering evolution $\epsilon$, and the redshift
selection function. In this context, we can assume that the values
of the angular correlation function parameters $\theta_0$ and $\gamma$
obtained in this work have been accurately determined thanks to
the large statistics involved (at least in the soft and hard
bands). The parameter $\epsilon$ is hardly known in any system and
therefore all our results will be reported for both clustering
models.

A critical issue for the inversion of Limber's equation is the
redshift selection function $N^{-1}\,dN/dz$. The luminosity function
model from which it is derived was computed using AGN from
a variety of public surveys ranging from deep pencil-beam to
shallow wide surveys, and the intrinsic absorption was taken into
account at hard X-rays (Ebrero et al. 2009). Hence, we can
assume that the redshift selection function is robust enough within
the redshift range in which it was computed (z ~ 0 - 3).

Table 5. Spatial correlation lengths $r_0$ for different clustering
models ($\epsilon$) obtained from Limber's equation. Errors are 1$\sigma$.

Band (label)      $\gamma$    $\epsilon$    $r_0$ ($h^{-1}$ Mpc)    Median z
Soft (S1)         2.12    -0.88    12.25 ± 0.12        0.96
Soft (S2)         2.12    -3        6.54 ± 0.06        0.96
Soft (S3)         1.80    -1.20    13.74 ± 0.14        0.96
Soft (S4)         1.80    -3        7.20 ± 0.07        0.96
Hard (H1)         2.33    -0.67     9.9 ± 2.4          0.94
Hard (H2)         2.33    -3        5.7 ± 1.4          0.94
Hard (H3)         1.80    -1.20    12.7 ± 0.5          0.94
Hard (H4)         1.80    -3        6.8 ± 0.3          0.94
Ultrahard (U1)    2.47    -0.53     7.0 ± 5.5          0.77
Ultrahard (U2)    2.47    -3        5.1 ± 4.1          0.77
Ultrahard (U3)    1.80    -1.20    11.6 ± 1.7          0.77
Ultrahard (U4)    1.80    -3        7.4 ± 1.1          0.77



Our results in the soft band assuming clustering constant
in physical coordinates ($\epsilon = -3$) are in good agreement with
those obtained from optically selected AGN surveys,
$r_0 \sim 5.4 - 8.6\,h^{-1}$ Mpc (Akylas et al. 2000; Croom et al.
2002; Grazian et al. 2004), and from X-ray selected surveys such
as Mullis et al. (2004) ($r_0 = 7.4\,h^{-1}$ Mpc) or Basilakos et
al. (2005) ($r_0 = 7.5 \pm 0.6\,h^{-1}$ Mpc). However, when we
assume clustering constant in comoving coordinates
($\epsilon = \gamma - 3$), the result of Basilakos et al. (2005) is
slightly larger than ours (at the 2$\sigma$ confidence level), while
the predictions of Miyaji et al. (2007) are systematically smaller
($r_0 = 9.8 \pm 0.7\,h^{-1}$ Mpc).

In the hard band, our deprojected spatial amplitude is
significantly smaller than the value from Basilakos et al. (2004),
who obtained $r_0 \sim 12 - 19\,h^{-1}$ Mpc using hard sources from
the XMM-Newton-2dF survey for a fixed canonical slope
$\gamma = 1.8$. However, they find that their correlation lengths
are much larger (by over a factor of 2) than the ones provided in
the literature for AGN (Croom et al. 2002) or for the 2dF (Hawkins
et al. 2003) and SDSS (Budavari et al. 2003) galaxy distributions.
In fact, the $r_0$ values obtained by Basilakos et al. (2004) can
be compared instead to those of extremely red objects (EROs) and
luminous radio sources (Roche et al. 2003; Overzier et al. 2003;
Rottgering et al. 2003), which are in the range
$r_0 \sim 12 - 15\,h^{-1}$ Mpc. On the other hand, the results of
Basilakos et al. (2004) for a best-fit slope $\gamma = 2.2$ are
consistent with ours within the error bars, which may imply that
the fixed slope representation is not a good characterisation of
the large-scale clustering of AGN.

Gilli et al. (2005) performed a broadband spatial clustering
analysis of the X-ray sources detected in the Chandra Deep Field
North and South, obtaining correlation lengths in the range
$r_0 = 5 - 10\,h^{-1}$ Mpc, roughly in agreement with our
predictions within the error bars, albeit with a much flatter
slope $\gamma \sim 1.4$. More recently, Gilli et al. (2009) studied
the spatial clustering of AGN in the COSMOS field, finding
$r_0 = 8.65\,h^{-1}$ Mpc and $\gamma = 1.88$.

In the context of a comoving clustering scenario
($\epsilon = \gamma - 3$), these results in the soft and hard bands
seem to confirm the difference in the spatial clustering between
X-ray selected and optically selected AGN reported in other works.
Our large correlation length $r_0 > 10\,h^{-1}$ Mpc is in general
consistent with that of X-ray selected AGN (Basilakos et al. 2004;
Basilakos et al. 2003; Puccetti et al. 2006), while reported
correlation lengths from optically selected AGN are significantly
shorter ($r_0 \sim 5\,h^{-1}$ Mpc, Croom et al. 2002). The
situation reverses when we assume clustering constant in physical
coordinates ($\epsilon = -3$), our computed $r_0$ then being
consistent with the values of Croom et al. (2002).

Our results in the ultrahard band are in good agreement
(within the error bars) with the deprojected spatial clustering
calculated by Miyaji et al. (2007) using COSMOS sources detected
in the 4.5-10 keV range. It must be taken into account, however,
that the $\theta_0$ values we used to invert Limber's equation in
this band correspond only to a marginal detection of clustering.

5.2. Dependence on the X-ray luminosity and redshift 

The results obtained in Section 4.4 showed that the angular
clustering strength seemed to depend on the flux limit of the
sample under consideration. Since several combinations of redshift
and luminosity yield a given flux, we have investigated whether
the deprojected spatial correlation length $r_0$ depends on the
X-ray luminosity and redshift. Moreover, this would provide
additional information on the evolution of the X-ray sources.

Since flux and luminosity are related to each other through
the luminosity distance, $S = L/4\pi d_L^2(z)$, and
$\theta_0 \propto r_0^{\gamma/(\gamma-1)}$ (via Limber's equation), if
we assume that the $\theta_0 - S$ relation described in Section 4.4,
$\theta_0 \propto S^{a}$, is generally true, we would expect a spatial
correlation length dependent on luminosity and redshift in the
form $r_0 \propto (L/d_L^2(z))^{a(\gamma-1)/\gamma}$.
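The size of the expected effect follows from simple arithmetic. Combining $\theta_0 \propto r_0^{\gamma/(\gamma-1)}$ with $\theta_0 \propto S^{a}$ gives $r_0 \propto S^{a(\gamma-1)/\gamma}$, so the exponent is only a fraction of a. The sketch below evaluates it; pairing the flux-dependence indices quoted in Section 4.4 with the free-slope values of $\gamma$ is a choice made for this illustration.

```python
def luminosity_exponent(a, gamma):
    """Exponent of r0 on L / d_L^2, i.e. a * (gamma - 1) / gamma."""
    return a * (gamma - 1.0) / gamma


# Illustrative pairing of flux-dependence indices with free slopes
soft_exponent = luminosity_exponent(0.15, 2.12)
hard_exponent = luminosity_exponent(0.34, 2.33)

# Expected ratio of r0 between two samples whose median L / d_L^2 differ
# by a factor of 10 (a hypothetical contrast).
soft_ratio = 10.0 ** soft_exponent
hard_ratio = 10.0 ** hard_exponent
```

Under these assumptions, an order of magnitude in flux changes $r_0$ by a few tens of per cent in the soft band and by somewhat more in the hard band, which gives an idea of the precision needed to detect such a trend.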

We have therefore inverted Limber's equation for different
flux-limited subsamples (see Section 4.4). The median redshifts
and luminosities have been derived from the LDDE best-fit
luminosity function of Ebrero et al. (2009). In Figure 8 we plot
the deprojected correlation lengths $r_0$ as a function of the
median luminosity and redshift of the different subsamples for
different clustering models.

We find no significant dependence of the clustering strength
on either the median luminosity or the median redshift over the
ranges spanned by our sample. In a comoving clustering scenario,
there seems to be no dependence on the median luminosity in either
the soft or the hard band (dots and triangles in the left panel of
Figure 8, respectively). However, when we consider $\epsilon = -3$ a
slightly positive trend is observed in both bands (squares and
crosses, respectively, same panel). In order to check the
significance of this dependence, we fitted both a constant value
and a constant plus a linear term to the points in Figure 8 and
compared their goodness of fit. The F-test results show no
significant improvement of the fit for either model and therefore
we conclude that no significant dependence of $r_0$ on luminosity
is found in our data. On the other hand, Plionis et al. (2008)
found indications of a luminosity-dependent clustering in the
Chandra Deep Fields, although their large error bars make this
conclusion rather uncertain. Using our much better constrained
best-fit parameters we are unable to confirm such a dependence.

Similarly, we do not see evolution in the clustering properties
of our sources, although a mild dependence could be present
as $r_0$ slightly decreases as the median redshift increases. The
spatial clustering analysis of the Chandra Deep Field (Gilli et
al. 2005) and COSMOS sources (Gilli et al. 2009) led to similar
conclusions.



5.3. Bias parameter and connection to dark matter haloes 

The spatial clustering values obtained in Section 5.1 can be used
to estimate the mass of the dark matter haloes (DMH) in which
these sources are embedded. A commonly used quantity for such
an analysis is the bias parameter, which is usually defined as

$$ b^2(z) = \frac{\xi_{\rm AGN}(8,z)}{\xi_{\rm DMH}(8,z)}, \qquad (22) $$

where $\xi_{\rm AGN}(8,z)$ and $\xi_{\rm DMH}(8,z)$ are the spatial
correlation functions of AGN and DMH evaluated at 8 $h^{-1}$ Mpc,
respectively. The former value has been calculated in this work
whereas the latter can be estimated using (Peebles 1980)

$$ \xi_{\rm DMH}(8,z) = \frac{\sigma_8^2(z)}{J_2}, \qquad (23) $$

where $J_2 = 72/[(3-\gamma)(4-\gamma)(6-\gamma)2^{\gamma}]$ and
$\sigma_8^2(z)$ is the dark matter density variance in a sphere with
a comoving radius of 8 $h^{-1}$ Mpc, which evolves as

$$ \sigma_8(z) = \sigma_8\, D(z). \qquad (24) $$

D(z) is the linear growth factor of perturbations, which is
defined as $D_{\rm EdS}(z) = (1+z)^{-1}$ in an Einstein-De Sitter
cosmology. However, in the context of a $\Lambda$-CDM cosmology the
growth of perturbations is weaker and it is attenuated by a
suppression factor g(z) so that $D(z) = g(z)(1+z)^{-1}$. We have
used here the analytical approximation for g(z) described in
Carroll et al. (1992). On the other hand, we have fixed the rms
dark matter fluctuation at the present time, $\sigma_8$, to 0.84
(Spergel et al. 2003).

The CDM structure formation scenario predicts that the bias
parameter is determined by the dark matter halo mass (Mo &
White 1996). We have used the large-scale bias relation as a
function of halo mass reported in Sheth et al. (2001)

$$ b(M,z) = 1 + \frac{1}{\sqrt{a}\,\delta_c(z)} \left[ a\nu^2\sqrt{a} + 0.5\sqrt{a}\,(a\nu^2)^{1-c} - \frac{(a\nu^2)^{c}}{(a\nu^2)^{c} + 0.5(1-c)(1-c/2)} \right], \qquad (25) $$

where $\nu = \delta_c(z)/\sigma(M,z)$, a = 0.707, and c = 0.6.
$\delta_c$ is the critical overdensity for collapse of a homogeneous
spherical perturbation, and it takes the value of $\approx 1.69$ in
an Einstein-De Sitter cosmology. For a general cosmology,
$\delta_c$ possesses a weak dependence on redshift as reported in
Navarro et al. (1997). $\sigma(M,z)$ is the rms density fluctuation
in the linear density field, which evolves as

$$ \sigma(M,z) = \sigma(M)\, D(z), \qquad (26) $$

where $\sigma(M)$ is given by the convolution of a power spectrum
P(k) with a top-hat window function w(k),

$$ \sigma^2(M) = \frac{1}{2\pi^2} \int_0^\infty k^2 P(k)\, w^2(k)\, dk. \qquad (27) $$

For a power-law power spectrum $P(k) \propto k^{n}$, the rms
fluctuation on mass scale M is

$$ \sigma(M) = \sigma_8 \left(\frac{M}{M_8}\right)^{-(n+3)/6}, \qquad (28) $$

where $M_8$ is the characteristic mean mass within 8 $h^{-1}$ Mpc
(see e.g. Martini & Weinberg 2001).

Fig. 8. Deprojected spatial correlation length $r_0$ as a function of the median luminosity (left panel) and redshift (right panel). The
results are shown for different clustering models $\epsilon = -1.2$ (soft band: dots; hard band: triangles) and $\epsilon = -3$ (soft: squares; hard:
crosses).

Table 6. Bias and dark matter halo mass. Errors are 1$\sigma$.

Band (label)^a    $\sigma_{8,{\rm AGN}}$    b              log $M_{\rm DMH}$^b
Soft (S1)         2.52 ± 0.10    4.82 ± 0.18    12.76 ± 0.31
Soft (S2)         1.30 ± 0.03    2.48 ± 0.07    12.50 ± 0.31
Soft (S3)         2.22 ± 0.01    4.24 ± 0.02    12.71 ± 0.30
Soft (S4)         1.24 ± 0.01    2.37 ± 0.01    12.49 ± 0.30
Hard (H1)         2.39 ± 0.56    4.53 ± 1.06    12.74 ± 0.34
Hard (H2)         1.26 ± 0.27    2.38 ± 0.51    12.49 ± 0.34
Hard (H3)         2.07 ± 0.06    3.92 ± 0.11    12.69 ± 0.31
Hard (H4)         1.18 ± 0.03    2.23 ± 0.06    12.47 ± 0.31
Ultrahard (U1)    1.81 ± 1.70    3.16 ± 2.97    12.66 ± 0.44
Ultrahard (U2)    1.22 ± 1.08    2.14 ± 1.88    12.50 ± 0.44
Ultrahard (U3)    1.91 ± 0.15    3.34 ± 0.26    12.68 ± 0.32
Ultrahard (U4)    1.27 ± 0.09    2.23 ± 0.16    12.52 ± 0.32

^a Labels are those of the fits in Table 5.
^b In units of $h^{-1}\,M_\odot$.

The results for the bias parameters and estimated DMH
masses for the different clustering models in the soft, hard
and ultrahard bands are reported in Table 6. We find an average
$\langle \log M_{\rm DMH} \rangle = 12.50 \pm 0.34\,h^{-1}\,M_\odot$ and
$\langle \log M_{\rm DMH} \rangle = 12.71 \pm 0.34\,h^{-1}\,M_\odot$ for the
clustering models $\epsilon = -3$ and $\epsilon = \gamma - 3$,
respectively. This is in excellent agreement with the results
from the AERQS survey (Grazian et al. 2004) or the 2dF survey
(Porciani et al. 2004; Croom et al. 2005) and, more recently, the
COSMOS survey (Gilli et al. 2009).
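The chain from a correlation length to a bias parameter can be sketched numerically. The code below computes the rms fluctuation of the AGN distribution within 8 $h^{-1}$ Mpc as $\sigma_{8,{\rm AGN}} = [J_2\,(r_0/8)^{\gamma}]^{1/2}$ and divides it by $\sigma_8 D(z)$. Two ingredients are assumptions of this example rather than statements of the text: the cosmology ($\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$), and the normalisation of the growth factor by g(0) so that D(0) = 1. The suppression factor follows the fitting formula of Carroll et al. (1992). With these choices the soft-band S1 values are recovered to within rounding.

```python
import math


def j2(gamma):
    """J2 = 72 / [(3 - gamma)(4 - gamma)(6 - gamma) 2**gamma]."""
    return 72.0 / ((3.0 - gamma) * (4.0 - gamma) * (6.0 - gamma) * 2.0 ** gamma)


def sigma8_agn(r0, gamma):
    """rms AGN fluctuation within 8 h^-1 Mpc for xi = (r / r0)**(-gamma)."""
    return math.sqrt(j2(gamma) * (r0 / 8.0) ** gamma)


def growth_suppression(z, omega_m=0.3, omega_l=0.7):
    """Fitting formula of Carroll et al. (1992) for g at redshift z."""
    e2 = omega_m * (1.0 + z) ** 3 + omega_l
    om_z = omega_m * (1.0 + z) ** 3 / e2
    ol_z = omega_l / e2
    return 2.5 * om_z / (om_z ** (4.0 / 7.0) - ol_z
                         + (1.0 + om_z / 2.0) * (1.0 + ol_z / 70.0))


def growth_factor(z, omega_m=0.3, omega_l=0.7):
    """Linear growth factor, normalised here so that D(0) = 1 (assumption)."""
    g0 = growth_suppression(0.0, omega_m, omega_l)
    return growth_suppression(z, omega_m, omega_l) / (g0 * (1.0 + z))


def bias_parameter(r0, gamma, z, sigma8=0.84):
    """b(z) = sigma8_AGN / (sigma8 * D(z))."""
    return sigma8_agn(r0, gamma) / (sigma8 * growth_factor(z))


# Fit S1: r0 = 12.25 h^-1 Mpc, gamma = 2.12, median z = 0.96
s1_sigma = sigma8_agn(12.25, 2.12)
s1_bias = bias_parameter(12.25, 2.12, 0.96)
```

The same functions applied to fit S2 ($r_0 = 6.54\,h^{-1}$ Mpc) give a bias roughly half as large, which is the difference between the two clustering models discussed in the text.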

Our derived bias parameters b are in excellent agreement
with those reported in Basilakos et al. (2008), who found
b(z = 1.2) = 4.88 ± 1.20 and b(z = 0.85) = 4.65 ± 1.50 in the soft
and hard bands, respectively. In general, X-ray selected AGN
show larger bias parameters than optically selected AGN (e.g.
Croom et al. 2005; Myers et al. 2007). This could mean that
the underlying matter distribution is traced differently in X-rays
and in the optical domain, with X-ray selected AGN residing in
more massive DMH than the optically selected AGN. A number
of works (Porciani et al. 2004; Croom et al. 2005; Basilakos
et al. 2008) show that optical AGN are likely to be hosted by
DMH with masses $< 10^{13}\,h^{-1}\,M_\odot$, while X-ray AGN are
usually embedded in DMH with masses $> 10^{13}\,h^{-1}\,M_\odot$. Our
results, however, are slightly lower but nevertheless consistent
with the estimations from the spatial clustering of COSMOS
sources (log $M_{\rm DMH} \sim 12.4-12.8\,h^{-1}\,M_\odot$, Gilli et al.
2009). We must stress that our results are derived from a mostly
unidentified X-ray sample which has been selected so that the vast
majority of the sources are likely to be AGN, although we can
expect some contamination from Galactic stars and passive
galaxies. We estimate the population of non-AGN sources in our
sample to be of the order of ~10% (see e.g. Barcons et al. 2007).
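The step from a bias value to a halo mass relies on the large-scale bias relation of Sheth et al. (2001), which gives b as a function of the peak height $\nu = \delta_c/\sigma(M,z)$. The sketch below implements that relation with a = 0.707, c = 0.6 and a coefficient of 0.5, and inverts it for $\nu$ by bisection. Treating $\delta_c = 1.69$ as constant and restricting the search to $1 \le \nu \le 10$, where the relation increases monotonically, are simplifications made for this example; converting $\nu$ into a mass additionally requires the values of $M_8$ and n, which are not specified here.

```python
import math


def smt_bias(nu, delta_c=1.69, a=0.707, b_coef=0.5, c=0.6):
    """Large-scale halo bias as a function of peak height nu."""
    x = a * nu ** 2
    sqrt_a = math.sqrt(a)
    bracket = (sqrt_a * x
               + sqrt_a * b_coef * x ** (1.0 - c)
               - x ** c / (x ** c + b_coef * (1.0 - c) * (1.0 - c / 2.0)))
    return 1.0 + bracket / (sqrt_a * delta_c)


def nu_for_bias(target, lo=1.0, hi=10.0, tol=1e-10):
    """Peak height whose bias equals target, found by bisection."""
    if not smt_bias(lo) <= target <= smt_bias(hi):
        raise ValueError("target bias outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if smt_bias(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


nu_s1 = nu_for_bias(4.82)   # peak height for a bias like that of fit S1
nu_s2 = nu_for_bias(2.48)   # peak height for a bias like that of fit S2
```

Given $\nu$, the rms fluctuation is $\sigma(M,z) = \delta_c/\nu$, and the mass follows from the power-law relation $\sigma(M) = \sigma_8 (M/M_8)^{-(n+3)/6}$ once $M_8$ and n are fixed.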

It is possible to trace the bias evolution with redshift using
the relations described above. At redshifts up to ~1, where the
median of the redshift distribution of our sample is expected to
lie, a simple model to describe the bias evolution is the so-called
conserving model (Nusser & Davis 1994; Fry 1996),

$$ b(z) = 1 + \frac{b_0 - 1}{D(z)}, \qquad (29) $$

where $b_0$ is the population bias at z = 0. This model assumes that
the objects, after being formed at a given high-redshift epoch,
evolve with time within the gravitational potential. We have
computed the bias parameter for several subsamples with
different median redshifts in the soft band, finding that the
present-time bias $b_0$ strongly depends on the clustering model.
For instance, for the $\epsilon = \gamma - 3$ model (fits S1, S3, H1
and H3), $b_0$ lies in the range 2.5-3.0, whereas for the
$\epsilon = -3$ model (fits S2, S4, H2, H4) we obtained
$b_0 \sim 1.75$ (see Table 7 and Figure 9). These results were
expected since the $\epsilon = -3$ model removes the redshift
dependence in equation 12 and hence produces lower correlation
lengths with respect to the $\epsilon = \gamma - 3$ model. Similar
results for $b_0$ were obtained by Basilakos et al. (2005), who
also studied the bias evolution for both clustering models. Gilli
et al. (2009) found $b_0$ values in the range 1.5-2, similar to
that of Croom et al. (2005), consistent with our predictions for
an $\epsilon = -3$ clustering scenario.

Table 7. Predicted bias at z = 0.

Band (label)    $\epsilon$          $b_0$
Soft (S1)       $\gamma - 3$    2.95 ± 0.09
Soft (S2)       -3              1.76 ± 0.04
Soft (S3)       $\gamma - 3$    2.65 ± 0.01
Soft (S4)       -3              1.71 ± 0.01
Hard (H1)       $\gamma - 3$    2.80 ± 0.02
Hard (H2)       -3              1.75 ± 0.25
Hard (H3)       $\gamma - 3$    2.53 ± 0.04
Hard (H4)       -3              1.72 ± 0.02

5.4. The lifetime of AGN

We can estimate the lifetime of AGN using the mean DMH
masses calculated above and making some simple assumptions.
We have followed the method proposed by Martini & Weinberg
(2001), assuming that we are sampling the most massive DMH
at a given redshift z and that each DMH hosts at most one active
AGN at any given time. There is hence a relation between the
comoving density of AGN $\Phi(z)$ and their lifetime
$t_{\rm AGN}(z)$ in the form

$$ \Phi(z) = \int_{M_{\min}}^{\infty} \frac{t_{\rm AGN}(z)}{t_{\rm DMH}(M,z)}\, n(M,z)\, dM, \qquad (30) $$

where $M_{\min}$ is the minimum halo mass hosting an AGN, n(M, z)
is the comoving density of DMH of mass M at redshift z, and
$t_{\rm DMH}(M,z)$ is the lifetime of DMH.



According to Martini & Weinberg (2001), the characteristic
halo lifetime is defined as the time it takes for a DMH of mass M
at redshift z to be incorporated into a larger halo of mass 2M. To
a first approximation, we can assume that the halo lifetime is
comparable to the Hubble time at that redshift,
$t_{\rm DMH}(M,z) \sim t_H(z)$. Equation 30 then yields

$$ t_{\rm AGN}(z) = t_H(z)\, \frac{\Phi(z)}{\Phi_{\rm DMH}(z)}, \qquad (31) $$

where

$$ \Phi_{\rm DMH}(z) = \int_{M_{\min}}^{\infty} n(M,z)\, dM \qquad (32) $$

is the comoving density of DMH with mass above $M_{\min}$.

We can estimate $\Phi_{\rm DMH}(z)$ following the Press-Schechter
approximation as described in Martini & Weinberg (2001), obtaining
a DMH comoving space density at z = 1 (approximately the
median of the redshift distribution of our sample) for haloes with
mass larger than log $M_{\min} = 12.6\,h^{-1}\,M_\odot$ (the average
halo mass estimated in Section 5.3) of
$\Phi_{\rm DMH} \sim 2 \times 10^{-3}\,h^{3}$ Mpc$^{-3}$.

For the comoving density of AGN, we have used the predicted
value from the X-ray luminosity function of Ebrero et al. (2009)
that we used to deproject Limber's equation, at the median
luminosity of our sample and z = 1. This yielded an AGN duty
cycle in the range $t_{\rm AGN}/t_H = 0.054 - 0.078$. In our
cosmological framework, the Hubble time at z = 1 is ~5.8 Gyr. The
estimated lifetime of AGN is hence in the range
$t_{\rm AGN} = 3.1 - 4.5 \times 10^{8}$ yr.
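The arithmetic behind these numbers is short enough to spell out. The sketch below evaluates $t_{\rm AGN} = t_H\,\Phi/\Phi_{\rm DMH}$ for the two ends of the duty-cycle range, and also computes the age of a flat Universe at z = 1 from the standard closed-form expression, to show where a value close to 5.8 Gyr comes from. The cosmological parameters ($H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_m = 0.3$) are assumptions of this example, and the AGN densities are simply the values implied by the quoted duty cycles and the quoted $\Phi_{\rm DMH}$.

```python
import math


def age_of_universe_gyr(z, h0=70.0, omega_m=0.3):
    """Age of a flat Lambda-CDM Universe at redshift z (assumed cosmology)."""
    omega_l = 1.0 - omega_m
    hubble_time_gyr = 977.8 / h0  # 1/H0 in Gyr for H0 in km/s/Mpc
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_l)) * math.asinh(x) * hubble_time_gyr


def agn_lifetime_yr(phi_agn, phi_dmh, t_h_gyr):
    """t_AGN = t_H * Phi_AGN / Phi_DMH, returned in years."""
    return t_h_gyr * 1.0e9 * phi_agn / phi_dmh


PHI_DMH = 2.0e-3          # h^3 Mpc^-3, value quoted in the text
T_H_GYR = 5.8             # Gyr, value quoted in the text

# AGN space densities implied by the quoted duty cycles (0.054 and 0.078)
t_low = agn_lifetime_yr(0.054 * PHI_DMH, PHI_DMH, T_H_GYR)
t_high = agn_lifetime_yr(0.078 * PHI_DMH, PHI_DMH, T_H_GYR)

age_z1 = age_of_universe_gyr(1.0)   # close to the 5.8 Gyr used above
```

The two lifetimes come out at about $3.1 \times 10^{8}$ and $4.5 \times 10^{8}$ yr, matching the range given in the text.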

However, the assumptions made in the Martini & Weinberg
(2001) approximation (i.e. AGN activity is a random event in
the lifetime of a halo) might not be valid at redshifts z < 2,
where the AGN space density begins to decline and hence
fuelling mechanisms, rather than black hole growth, may trigger
AGN activity and thus dominate the clustering properties.

The estimated lifetime derived above corresponds to the total
activity period for a single AGN, which can be split into several
episodes of activity. The significantly shorter lifetime compared
to the time period spanned between redshift ~1 and the present
indicates that we are probably observing several generations of
AGN, and that an important fraction of galaxies might experience
AGN activity one or more times throughout their lives.



Fig. 9. Bias parameter as a function of redshift for different bands and clustering models. Left panel: Clustering model $\epsilon = \gamma - 3$ fits
in the soft (S1 and S3, dots and triangles, respectively), and hard (H1 and H3, squares and crosses, respectively) bands. Right panel:
Clustering model $\epsilon = -3$ fits in the soft (S2 and S4, dots and triangles, respectively), and hard (H2 and H4, squares and crosses,
respectively) bands. Overplotted are the best fits to the conserving bias evolution model in each band.

6. Conclusions

We have studied the angular correlation function of a large
sample of serendipitous X-ray sources from 1063 XMM-Newton
observations at high Galactic latitudes in several energy bands:
0.5-2 (soft), 2-10 (hard) and 4.5-10 (ultrahard) keV. Our sample
comprises 31288 and 9188 sources in the soft and hard bands,
respectively, covering ~125.5 deg$^2$ of the sky, and 1259 sources
in the ultrahard band over ~51.5 deg$^2$, thus being the largest
sample ever used in clustering investigations.

We found a significant positive angular clustering signal in the
soft (~10$\sigma$) and hard (~5$\sigma$) bands, while our results in the
ultrahard band are only marginal (<1$\sigma$). The result in the hard
band clears up the debate on whether X-ray sources detected in
this band cluster or not, since a number of past works had
reported different inconclusive results ranging from a few
$\sigma$ detections to no detection at all.

We made power-law fits to the angular correlation function
taking into account correlations between errors in the
range 50-1000 arcsec, determining the best-fit parameters with
unprecedented accuracy. We obtained typical correlation lengths of
$\theta_0$ = 22.9 ± 2.0 arcsec (soft), 29.2 arcsec (hard), and
40.9 arcsec (ultrahard), and slopes $\gamma - 1$ of 1.12 ± 0.04,
1.33, and 1.47 in the soft, hard and ultrahard bands,
respectively. An angular clustering characterisation with a fixed
canonical slope of $\gamma - 1 = 0.8$, the typical value found for
nearby galaxies, does not reproduce the observed data well.

Previous angular clustering studies reported that the
clustering strength might depend on the flux limit of the sample
(Giacconi et al. 2001; Plionis et al. 2008). Indeed, after splitting
our sample into several subsamples at different flux limits we
found a dependence of $\theta_0$ on the flux limit. One possible
explanation for this behaviour is that different flux limits
effectively sample different source populations, thus reflecting
an underlying dependence of the clustering properties on redshift.

We have also studied the angular clustering of hardness-ratio
selected subsamples in the soft and hard bands, finding that the
clustering properties of sources with HR > -0.2 are not
significantly different from those of HR < -0.2 sources. Since the
former are likely to be absorbed AGN, this may provide support to
unification theories, in which obscuration is due to an
orientation effect and is unrelated to the large-scale clustering.
Other works (e.g. Gilli et al. 2005, 2009) have also failed to find
significant differences in the spatial clustering of absorbed and
unabsorbed AGN.



We inverted Limber's equation, assuming a given redshift
distribution for our sources, in order to estimate typical
spatial correlation lengths. We found values of
$r_0$ = 12.25 ± 0.12, 9.9 ± 2.4, and 7.0 ± 5.5 $h^{-1}$ Mpc in the
soft, hard and ultrahard bands, respectively, for a clustering
model constant in comoving coordinates, while for clustering
constant in physical coordinates we obtained
$r_0$ = 6.54 ± 0.06, 5.7 ± 1.4, and 5.1 ± 4.1 $h^{-1}$ Mpc,
respectively.

Inverting Limber's equation for different flux-limited
subsamples reveals no dependence of the typical deprojected spatial
length on either the median luminosity or the median redshift of
the samples. A slightly positive trend might be observed when
assuming $\epsilon = -3$ clustering, although it does not provide
a significantly better fit compared with a constant model,
according to the F-test. These results appear to be in
contradiction with those of Plionis et al. (2008) but are in
agreement with those of Gilli et al. (2005) and (2009). Moreover,
this could mean that the $\theta_0$-flux limit dependence discussed
in Section 4.4 might be caused by the fact that pairs of sources
tend to appear closer in deep surveys with fainter flux limits.

We used these values to calculate the rms fluctuations of the
AGN distributions within a sphere of radius 8 $h^{-1}$ Mpc, and
compared them with those of the underlying mass distribution from
linear theory in order to estimate the bias parameter of our X-ray
sources. We obtained values ranging from ~2 to ~4.8 in the
redshift interval 0.5 < z < 1. The bias depends on the mass of the
dark matter haloes (DMH) that host the AGN population. From
the computed bias values we have estimated a typical DMH mass
of $\langle \log M_{\rm DMH} \rangle = 12.60 \pm 0.34\,h^{-1}\,M_\odot$.

The typical AGN lifetime derived from the Press-Schechter
approximation at redshift z ~ 1 lies in the range
$t_{\rm AGN} = 3.1 - 4.5 \times 10^{8}$ yr. This interval is
significantly shorter than the time span between that redshift and
the present, thus suggesting the existence of many AGN
generations, and that a significant fraction of galaxies may
switch from a quiescent phase to AGN activity, and vice versa,
several times throughout their lives.